Abstract: The tremendous growth in data has had an immense impact on organizations. Their infrastructure and traditional data management systems are unable to handle Big Data. They must either invest heavily in their own infrastructure or move their Big Data analytics to the Cloud, where they can benefit from both on-demand scalability and contemporary data management techniques. However, to make Cloud-hosted Big Data analytics available to a wider range of enterprises, their preferences in terms of budget and service level objectives must be captured carefully. Therefore, in this study we propose SLA-driven resource provisioning and scheduling in a multiple data centre environment. The user's requirements, expressed as SLA constraints (deadline and budget), are captured at an entry point, from which the user request and user information are sent to the cloud provider. The cloud provider receives the SLA constraints and the user's job details, checks all the data centres for availability of resources, and decides the data centre at which the user application can be deployed without violating the SLA constraints on deadline and budget. Further, a pruned tree based scheduling algorithm is used to provision cloud resources and schedule the tasks.
Keywords: Big Data, Cloud Computing, Service Level Agreement, multiple data centres, deadline, budget, scheduling, pruned tree
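
As an illustration of the data centre selection step summarized in the abstract, the following minimal Python sketch filters data centres by resource availability and by the SLA constraints (deadline and budget). All names (`SLA`, `Job`, `DataCentre`, `estimate`, `select_data_centre`), the linear runtime and cost model, and the tie-breaking policy of choosing the cheapest feasible data centre are assumptions made for illustration only. The sketch does not reproduce the pruned tree based scheduling algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SLA:
    deadline_hours: float  # latest acceptable completion time
    budget: float          # maximum acceptable total cost


@dataclass
class Job:
    vms_needed: int          # number of virtual machines requested
    vm_hours_of_work: float  # total work, in VM-hours at speed factor 1.0


@dataclass
class DataCentre:
    name: str
    free_vms: int
    cost_per_vm_hour: float
    speed_factor: float = 1.0  # hypothetical relative VM speed


def estimate(job: Job, dc: DataCentre) -> Tuple[float, float]:
    """Return (runtime in hours, total cost) under an assumed linear model."""
    runtime = job.vm_hours_of_work / (job.vms_needed * dc.speed_factor)
    cost = runtime * job.vms_needed * dc.cost_per_vm_hour
    return runtime, cost


def select_data_centre(job: Job, sla: SLA,
                       data_centres: List[DataCentre]) -> Optional[DataCentre]:
    """Pick the cheapest data centre that meets both deadline and budget.

    Returns None when no data centre can host the job without an SLA violation.
    Choosing the cheapest feasible option is an assumed policy.
    """
    best: Optional[DataCentre] = None
    best_cost = float("inf")
    for dc in data_centres:
        if dc.free_vms < job.vms_needed:
            continue  # not enough free resources
        runtime, cost = estimate(job, dc)
        feasible = runtime <= sla.deadline_hours and cost <= sla.budget
        if feasible and cost < best_cost:
            best, best_cost = dc, cost
    return best
```

In this sketch a request is rejected (the function returns `None`) when every data centre either lacks free resources or would breach the deadline or the budget. A real provider would presumably use measured performance profiles rather than the linear estimate assumed here.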